Goto

Collaborating Authors

 Vellore


KAN-AFT: An Interpretable Nonlinear Survival Model Integrating Kolmogorov-Arnold Networks with Accelerated Failure Time Analysis

Jose, Mebin, Francis, Jisha, Kattumannil, Sudheesh Kumar

arXiv.org Machine Learning

Survival analysis relies fundamentally on the semi-parametric Cox Proportional Hazards (CoxPH) model and the parametric Accelerated Failure Time (AFT) model. CoxPH assumes constant hazard ratios, often failing to capture real-world dynamics, while traditional AFT models are limited by rigid distributional assumptions. Although deep learning models like DeepAFT address these constraints by improving predictive accuracy and handling censoring, they inherit the significant challenge of black-box interpretability. The recent introduction of CoxKAN demonstrated the successful integration of Kolmogorov-Arnold Networks (KANs), a novel architecture that yields highly accurate and interpretable symbolic representations, within the CoxPH framework. Motivated by the interpretability gains of CoxKAN, we introduce KAN-AFT (Kolmogorov Arnold Network-based AFT), the first framework to apply KANs to the AFT model. Our primary contributions include: (i) a principled AFT-KAN formulation, (ii) robust optimization strategies for right-censored observations (e.g., Buckley-James and IPCW), and (iii) an interpretability pipeline that converts the learned spline functions into closed-form symbolic equations for survival time. Empirical results on multiple datasets confirm that KAN-AFT achieves performance comparable to or better than DeepAFT, while uniquely providing transparent, symbolic models of the survival process.


A Comprehensive Performance Comparison of Traditional and Ensemble Machine Learning Models for Online Fraud Detection

Khekare, Ganesh, Sunda, Shivam, Bothra, Yash

arXiv.org Artificial Intelligence

In the era of the digitally driven economy, where there has been an exponential surge in digital payment systems and other online activities, various forms of fraudulent activities have accompanied the digital growth, out of which credit card fraud has become an increasingly significant threat. To deal with this, real-time fraud detection is essential for financial security but remains challenging due to high transaction volumes and the complexity of modern fraud patterns. This study presents a comprehensive performance comparison between traditional machine learning models like Random Forest, SVM, Logistic Regression, XGBoost, and ensemble methods like Stacking and Voting Classifier for detecting credit card fraud on a heavily imbalanced public dataset, where the number of fraudulent transactions is 492 out of 284,807 total transactions. Application-specific preprocessing techniques were applied, and the models were evaluated using various performance metrics. The ensemble methods achieved an almost perfect precision of around 0.99, but traditional methods demonstrated superior performance in terms of recall, which highlights the trade-off between false positives and false negatives. The comprehensive comparison reveals distinct performance strengths and limitations for each algorithm, offering insights to guide practitioners in selecting the most effective model for robust fraud detection applications in real-world settings.


The Impact of Artificial Intelligence on Traditional Art Forms: A Disruption or Enhancement

Marella, Viswa Chaitanya, Erukude, Sai Teja, Veluru, Suhasnadh Reddy

arXiv.org Artificial Intelligence

The introduction of Artificial Intelligence (AI) into the domains of traditional art (visual arts, performing arts, and crafts) has sparked a complicated discussion about whether this might be an agent of disruption or an enhancement of our traditional art forms. This paper looks at the duality of AI, exploring the ways that recent technologies like Generative Adversarial Networks and Diffusion Models, and text-to-image generators are changing the fields of painting, sculpture, calligraphy, dance, music, and the arts of craft. Using examples and data, we illustrate the ways that AI can democratize creative expression, improve productivity, and preserve cultural heritage, while also examining the negative aspects, including: the threats to authenticity within art, ethical concerns around data, and issues including socio-economic factors such as job losses. While we argue for the context-dependence of the impact of AI (the potential for creative homogenization and the devaluation of human agency in artmaking), we also illustrate the potential for hybrid practices featuring AI in cuisine, etc. We advocate for the development of ethical guidelines, collaborative approaches, and inclusive technology development. In sum, we are articulating a vision of AI in which it amplifies our innate creativity while resisting the displacement of the cultural, nuanced, and emotional aspects of traditional art. The future will be determined by human choices about how to govern AI so that it becomes a mechanism for artistic evolution and not a substitute for the artist's soul.


Mentalic Net: Development of RAG-based Conversational AI and Evaluation Framework for Mental Health Support

Dutta, Anandi, Mruthyunjaya, Shivani, Saddington, Jessica, Islam, Kazi Sifatul

arXiv.org Artificial Intelligence

The emergence of large language models (LLMs) has unlocked boundless possibilities, along with significant challenges. In response, we developed a mental health support chatbot designed to augment professional healthcare, with a strong emphasis on safe and meaningful application. Our approach involved rigorous evaluation, covering accuracy, empathy, trustworthiness, privacy, and bias. We employed a retrieval-augmented generation (RAG) framework, integrated prompt engineering, and fine-tuned a pre-trained model on novel datasets. The resulting system, Mentalic Net Conversational AI, achieved a BERT Score of 0.898, with other evaluation metrics falling within satisfactory ranges. We advocate for a human-in-the-loop approach and a long-term, responsible strategy in developing such transformative technologies, recognizing both their potential to change lives and the risks they may pose if not carefully managed.


ART: Adaptive Resampling-based Training for Imbalanced Classification

Basandrai, Arjun, Jain, Shourya, Ilanthenral, K.

arXiv.org Machine Learning

Traditional resampling methods for handling class imbalance typically uses fixed distributions, undersampling the majority or oversampling the minority. These static strategies ignore changes in class-wise learning difficulty, which can limit the overall performance of the model. This paper proposes an Adaptive Resampling-based Training (ART) method that periodically updates the distribution of the training data based on the class-wise performance of the model. Specifically, ART uses class-wise macro F1 scores, computed at fixed intervals, to determine the degree of resampling to be performed. Unlike instance-level difficulty modeling, which is noisy and outlier-sensitive, ART adapts at the class level. This allows the model to incrementally shift its attention towards underperforming classes in a way that better aligns with the optimization objective. Results on diverse benchmarks, including Pima Indians Diabetes and Yeast dataset demonstrate that ART consistently outperforms both resampling-based and algorithm-level methods, including Synthetic Minority Oversampling Technique (SMOTE), NearMiss Undersampling, and Cost-sensitive Learning on binary as well as multi-class classification tasks with varying degrees of imbalance. In most settings, these improvements are statistically significant. On tabular datasets, gains are significant under paired t-tests and Wilcoxon tests (p < 0.05), while results on text and image tasks remain favorable. Compared to training on the original imbalanced data, ART improves macro F1 by an average of 2.64 percentage points across all tested tabular datasets. Unlike existing methods, whose performance varies by task, ART consistently delivers the strongest macro F1, making it a reliable choice for imbalanced classification.


Why Stop at Words? Unveiling the Bigger Picture through Line-Level OCR

Vempati, Shashank, Anand, Nishit, Talebailkar, Gaurav, Garai, Arpan, Arora, Chetan

arXiv.org Artificial Intelligence

Conventional optical character recognition (OCR) techniques segmented each character and then recognized. This made them prone to error in character segmentation, and devoid of context to exploit language models. Advances in sequence to sequence translation in last decade led to modern techniques first detecting words and then inputting one word at a time to a model to directly output full words as sequence of characters. This allowed better utilization of language models and bypass error-prone character segmentation step. We observe that the above transition in style has moved the bottleneck in accuracy to word segmentation. Hence, in this paper, we propose a natural and logical progression from word level OCR to line-level OCR. The proposal allows to bypass errors in word detection, and provides larger sentence context for better utilization of language models. We show that the proposed technique not only improves the accuracy but also efficiency of OCR. Despite our thorough literature survey, we did not find any public dataset to train and benchmark such shift from word to line-level OCR. Hence, we also contribute a meticulously curated dataset of 251 English page images with line-level annotations. Our experimentation revealed a notable end-to-end accuracy improvement of 5.4%, underscoring the potential benefits of transitioning towards line-level OCR, especially for document images. We also report a 4 times improvement in efficiency compared to word-based pipelines. With continuous improvements in large language models, our methodology also holds potential to exploit such advances. Project Website: https://nishitanand.github.io/line-level-ocr-website


Current State in Privacy-Preserving Text Preprocessing for Domain-Agnostic NLP

Sinha, Abhirup, Saha, Pritilata, Saha, Tithi

arXiv.org Artificial Intelligence

Privacy is a fundamental human right. Data privacy is protected by different regulations, such as GDPR. However, modern large language models require a huge amount of data to learn linguistic variations, and the data often contains private information. Research has shown that it is possible to extract private information from such language models. Thus, anonymizing such private and sensitive information is of utmost importance. While complete anonymization may not be possible, a number of different pre-processing approaches exist for masking or pseudonymizing private information in textual data. This report focuses on a few of such approaches for domain-agnostic NLP tasks.


VocalEyes: Enhancing Environmental Perception for the Visually Impaired through Vision-Language Models and Distance-Aware Object Detection

Chavan, Kunal, Balaji, Keertan, Barigidad, Spoorti, Chiluveru, Samba Raju

arXiv.org Artificial Intelligence

With an increasing demand for assistive technologies that promote the independence and mobility of visually impaired people, this study suggests an innovative real-time system that gives audio descriptions of a user's surroundings to improve situational awareness. The system acquires live video input and processes it with a quantized and fine-tuned Florence-2 big model, adjusted to 4-bit accuracy for efficient operation on low-power edge devices such as the NVIDIA Jetson Orin Nano. By transforming the video signal into frames with a 5-frame latency, the model provides rapid and contextually pertinent descriptions of objects, pedestrians, and barriers, together with their estimated distances. The system employs Parler TTS Mini, a lightweight and adaptable Text-to-Speech (TTS) solution, for efficient audio feedback. It accommodates 34 distinct speaker types and enables customization of speech tone, pace, and style to suit user requirements. This study examines the quantization and fine-tuning techniques utilized to modify the Florence-2 model for this application, illustrating how the integration of a compact model architecture with a versatile TTS component improves real-time performance and user experience. The proposed system is assessed based on its accuracy, efficiency, and usefulness, providing a viable option to aid vision-impaired users in navigating their surroundings securely and successfully.


An Improved Rapidly Exploring Random Tree Algorithm for Path Planning in Configuration Spaces with Narrow Channels

Noel, Mathew Mithra, Chawla, Akshay

arXiv.org Artificial Intelligence

Rapidly-exploring Random Tree (RRT) algorithms have been applied successfully to challenging robot motion planning and under-actuated nonlinear control problems. However a fundamental limitation of the RRT approach is the slow convergence in configuration spaces with narrow channels because of the small probability of generating test points inside narrow channels. This paper presents an improved RRT algorithm that takes advantage of narrow channels between the initial and goal states to find shorter paths by improving the exploration of narrow regions in the configuration space. The proposed algorithm detects the presence of narrow channel by checking for collision of neighborhood points with the infeasible set and attempts to add points within narrow channels with a predetermined bias. This approach is compared with the classical RRT and its variants on a variety of benchmark planning problems. Simulation results indicate that the algorithm presented in this paper computes a significantly shorter path in spaces with narrow channels.


Self-DenseMobileNet: A Robust Framework for Lung Nodule Classification using Self-ONN and Stacking-based Meta-Classifier

Rahman, Md. Sohanur, Chowdhury, Muhammad E. H., Rahman, Hasib Ryan, Ahmed, Mosabber Uddin, Kabir, Muhammad Ashad, Roy, Sanjiban Sekhar, Sarmun, Rusab

arXiv.org Artificial Intelligence

In this study, we propose a novel and robust framework, Self-DenseMobileNet, designed to enhance the classification of nodules and non-nodules in chest radiographs (CXRs). Our approach integrates advanced image standardization and enhancement techniques to optimize the input quality, thereby improving classification accuracy. To enhance predictive accuracy and leverage the strengths of multiple models, the prediction probabilities from Self-DenseMobileNet were transformed into tabular data and used to train eight classical machine learning (ML) models; the top three performers were then combined via a stacking algorithm, creating a robust meta-classifier that integrates their collective insights for superior classification performance. To enhance the interpretability of our results, we employed class activation mapping (CAM) to visualize the decision-making process of the best-performing model. Our proposed framework demonstrated remarkable performance on internal validation data, achieving an accuracy of 99.28\% using a Meta-Random Forest Classifier. When tested on an external dataset, the framework maintained strong generalizability with an accuracy of 89.40\%. These results highlight a significant improvement in the classification of CXRs with lung nodules.